[hansen_singleton] Replace pandas-datareader with direct FRED / Fama-French downloads#925

Open: mmcky wants to merge 3 commits into main from fix-hansen-pandas-datareader
mmcky commented Jun 21, 2026

Contributor

Closes #924.

Stacked on #926 — merge that first. This PR reads a vendored data snapshot that #926 adds to main. Until #926 lands, this PR's CI will fail (the …/main/… data URL 404s); it goes green once #926 is merged.

Problem

hansen_singleton_1982 and hansen_singleton_1983 fail to execute under anaconda 2026.06 / pandas 3.0 (surfaced by the forced full execution in #923). Both !pip install pandas-datareader and import it, but pandas-datareader 0.10.0 (unmaintained since 2021) relies on the private pandas API pandas.util._decorators.deprecate_kwarg, whose signature changed in pandas 3.0, so it dies at import:

TypeError: deprecate_kwarg() missing 1 required positional argument: 'new_arg_name'
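
The failure mode can be reproduced in miniature. The sketch below uses a hypothetical stand-in for the private helper (its parameter names and the `access_key` / `api_key` arguments are illustrative assumptions, not a transcription of pandas or pandas-datareader internals) to show why a changed decorator-factory signature fails as soon as the decorated function is defined, before any user code calls it:

```python
import functools


def deprecate_kwarg(klass, old_arg_name, new_arg_name):
    """Hypothetical stand-in: a decorator factory that gained a leading parameter."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map the old keyword onto the new one before calling through.
            if old_arg_name in kwargs:
                kwargs[new_arg_name] = kwargs.pop(old_arg_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


def define_reader():
    # Simulates importing a module written against the old two-argument form:
    # the factory is called when the `def` statement executes.
    @deprecate_kwarg("access_key", "api_key")
    def read(api_key=None):
        return api_key
    return read


message = None
try:
    define_reader()
except TypeError as exc:
    message = str(exc)

print(message)
```

Because the decorator runs at definition time, a downstream package that depends on a private helper has no way to recover at call time; the import itself raises.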

Approach

Following discussion, instead of fetching from the data providers at build time, the data is vendored:

  • #926 ("Add vendored data + maintenance scripts for hansen_singleton lectures") adds _static/lecture_specific/hansen_singleton_198{2,3}/, which holds a make_data.py maintenance script (builds the dataset from FRED + Ken French), the frozen *_data.csv snapshot, and a README.md.
  • This PR removes the pandas-datareader dependency (and the inline fetch helpers) and collapses each lecture's hidden data cell to a single pd.read_csv(<raw GitHub URL>) of that snapshot, selecting the columns it needs.

This matches the existing vendored-data convention used by mle, ols, and pandas_panel, keeps the build reproducible, and removes the live-fetch fragility (the flaky-network class that also bit ols).
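
As a rough illustration of the collapsed data cell, here is a minimal sketch assuming a snapshot with a date index and a few named columns. The column names and values are hypothetical, and an in-memory buffer stands in for the raw GitHub URL so the example runs offline:

```python
import io

import pandas as pd

# Hypothetical snapshot content; in the lectures this would be the raw
# GitHub URL of the vendored *_data.csv file rather than an in-memory buffer.
SNAPSHOT = io.StringIO(
    "date,consumption,market_return,tbill_rate\n"
    "1959-02-28,1.00,0.010,0.002\n"
    "1959-03-31,1.01,0.004,0.002\n"
    "1959-04-30,1.02,0.021,0.003\n"
)

# One read of the frozen snapshot, with the first column parsed as a date index.
data = pd.read_csv(SNAPSHOT, index_col=0, parse_dates=True)

# Each lecture then keeps only the columns it needs.
frame = data[["consumption", "market_return"]]
print(frame)
```

pd.read_csv accepts a URL string in place of the buffer, so swapping in the real snapshot location would not change the rest of the cell.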

Verification

| Check | Result |
| --- | --- |
| make_data.py output vs old pandas-datareader path (pandas 2.3.3) | byte-identical FRED + Fama-French data |
| Lectures' code cells reading the vendored CSV, FutureWarning/DeprecationWarning → error | run clean (pandas-3.0 safe) |
| Resulting frames | 239 rows (1959-02 → 1978-12), expected columns, moments unchanged |
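
A check along these lines could look like the following sketch. It is not the script used for this PR: build_frame is a hypothetical stand-in for the data construction, and comparing serialized CSV text is one assumed way to interpret "byte-identical":

```python
import warnings

import pandas as pd


def build_frame():
    """Hypothetical stand-in for the data construction step."""
    index = pd.to_datetime(["1959-02-28", "1959-03-31", "1959-04-30"])
    return pd.DataFrame({"x": [1.0, 2.0, 3.0]}, index=index)


# Promote the two warning classes to errors, so any deprecated pandas usage
# inside the construction raises instead of passing silently.
with warnings.catch_warnings():
    warnings.simplefilter("error", FutureWarning)
    warnings.simplefilter("error", DeprecationWarning)
    new = build_frame()

# Stand-in for the frame produced by the old code path.
old = build_frame()

# Exact value comparison, plus a comparison of the serialized CSV text.
pd.testing.assert_frame_equal(new, old, check_exact=True)
same_csv = new.to_csv() == old.to_csv()
print(same_csv)
```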

Net effect on the lectures: −236 / +34 lines — the data machinery moves out to the maintenance scripts.

Note

The two ar1_* lectures that also fail under a forced run are a separate, pre-existing arviz issue, out of scope here.

…ench fetch

pandas-datareader 0.10.0 (last released 2021, unmaintained) breaks at
import under pandas 3.0 -- it relies on pandas' private deprecate_kwarg,
whose signature changed -- so hansen_singleton_1982 and
hansen_singleton_1983 fail to execute under anaconda 2026.06 (see #923).
There is no pandas-3.0-compatible pandas-datareader release to pin to.

Replace the two web.DataReader calls with small direct downloads that use
only the standard library + pandas:

- FRED: pd.read_csv from the fredgraph.csv endpoint
- Fama-French: parse the F-F_Research_Data_Factors zip from the Ken French
  data library

Since no extra package is needed, the in-notebook `!pip install
pandas-datareader` cell and the now-dead date_parser warnings filter are
removed too.
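
For illustration, parsing the monthly block out of a Fama-French style zip with only the standard library and pandas might look like the sketch below. The file layout (preamble, a header row starting with a comma, YYYYMM rows, then a blank line before the annual block) is an assumption, the sample numbers are synthetic, and parse_monthly_factors is a hypothetical name rather than the helper from this commit:

```python
import io
import zipfile

import pandas as pd


def parse_monthly_factors(zip_bytes):
    """Return the monthly factor table from a zipped CSV (assumed layout)."""
    # The `with` block closes the archive explicitly.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        name = archive.namelist()[0]
        text = archive.read(name).decode("utf-8")

    lines = text.splitlines()
    # The first header row that starts with ",Mkt-RF" opens the monthly block.
    start = next(i for i, line in enumerate(lines) if line.startswith(",Mkt-RF"))
    rows = []
    for line in lines[start + 1:]:
        first = line.split(",")[0].strip()
        if not (len(first) == 6 and first.isdigit()):
            break  # the monthly block ends at the first non-YYYYMM row
        rows.append(line)

    table = "\n".join([lines[start]] + rows)
    frame = pd.read_csv(io.StringIO(table), index_col=0, skipinitialspace=True)
    # Convert YYYYMM integers to month-end timestamps.
    dates = pd.to_datetime(frame.index.astype(str), format="%Y%m")
    frame.index = dates + pd.offsets.MonthEnd(0)
    frame.index.name = "date"
    return frame


# Synthetic file imitating the assumed layout (values are made up).
sample = (
    "Synthetic preamble line\n"
    "\n"
    ",Mkt-RF,SMB,HML,RF\n"
    "195902,    1.00,    0.50,   -0.25,    0.20\n"
    "195903,    0.30,   -0.10,    0.40,    0.22\n"
    "\n"
    " Annual Factors: January-December\n"
    ",Mkt-RF,SMB,HML,RF\n"
    "  1959,   10.00,    1.00,    2.00,    2.50\n"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("factors.csv", sample)

factors = parse_monthly_factors(buffer.getvalue())
print(factors)
```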

Verified the new fetch returns byte-identical FRED and Fama-French data to
the old pandas-datareader path on pandas 2.3.3, and that the full data
construction runs clean with FutureWarning/DeprecationWarning promoted to
errors (i.e. pandas-3.0 safe).

Closes #924

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 21, 2026 09:46

Copilot AI left a comment

Contributor

Pull request overview

This PR updates the Hansen–Singleton 1982/1983 lecture notebooks to remove the runtime dependency on pandas-datareader (which is incompatible with pandas 3.0), replacing it with direct downloads from FRED (CSV endpoint) and the Ken French data library (zip + CSV parsing) using only the standard library and pandas.

Changes:

  • Removed the in-notebook !pip install pandas-datareader and the pandas_datareader import usage.
  • Added small in-notebook helpers to download/parse FRED series and monthly Fama–French factors directly.
  • Updated lecture text to reflect the new data sources (FRED + Ken French) and the direct-download approach.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| lectures/hansen_singleton_1982.md | Replaces pandas-datareader-based FRED/Fama–French fetching with direct downloads and parsing. |
| lectures/hansen_singleton_1983.md | Same migration as 1982 lecture, keeping the constructed estimation dataset consistent while avoiding pandas 3.0 breakage. |

Comment thread lectures/hansen_singleton_1983.md Outdated
Comment thread lectures/hansen_singleton_1982.md Outdated
…line

Switch both lectures to read the pre-built monthly CSV from
_static/lecture_specific/hansen_singleton_198{2,3}/ (added in PR #926) via its
raw GitHub URL, replacing the inline FRED / Fama-French download helpers from
the previous commit. The data construction now lives in the per-lecture
make_data.py maintenance scripts; the lectures just read the frozen snapshot.

This keeps the build reproducible and off the live data endpoints, and still
removes the pandas-datareader dependency that breaks under pandas 3.0.

Depends on PR #926 (must land on main first so the raw URL resolves).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@mmcky

mmcky commented Jun 21, 2026

Contributor Author

Updated to the vendored-data pattern discussed: the data + maintenance script + README now live in #926, and this PR just reads the snapshot from GitHub. Merge #926 first; this PR's CI will be red until then (the /main/ data URL 404s until #926 lands), after which I'll re-trigger and confirm green.

@github-actions


📖 Netlify Preview Ready!

Preview URL: https://pr-925--sunny-cactus-210e3e.netlify.app

Commit: edcb718


Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comment thread lectures/hansen_singleton_1983.md Outdated
Comment on lines +1450 to +1453
frame = pd.read_csv(DATA_URL, index_col=0, parse_dates=True)
start = pd.Timestamp(start).to_period("M").to_timestamp("M")
end = pd.Timestamp(end).to_period("M").to_timestamp("M")
return frame.loc[start:end]

Contributor Author

Fixed in 135e0dd: the CSV is now read once at cell scope via _data = pd.read_csv(DATA_URL, ...), and load_hs_monthly_data slices a copy of it. Verified that the CSV is fetched only once even though both get_estimation_data and get_tbill_estimation_data call load_hs_monthly_data.

Comment thread lectures/hansen_singleton_1982.md Outdated
Comment on lines +1008 to +1011
frame = pd.read_csv(DATA_URL, index_col=0, parse_dates=True)
start = pd.Timestamp(start).to_period("M").to_timestamp("M")
end = pd.Timestamp(end).to_period("M").to_timestamp("M")
return frame.loc[start:end]

Contributor Author

Fixed in 135e0dd — the vendored CSV is now read once into a cell-scope _data and load_hs_monthly_data returns a sliced .copy(), so repeated calls don't re-fetch or re-parse.

Read the snapshot once into a module-level _data and have load_hs_monthly_data
slice a copy of it, instead of re-downloading/parsing on every call. This
removes the redundant fetch in hansen_singleton_1983 (which loads via both
get_estimation_data and get_tbill_estimation_data). The .copy() keeps callers
from mutating the cached frame. Addresses Copilot review on PR #925.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
mmcky added a commit that referenced this pull request Jun 21, 2026
Wrap zipfile.ZipFile(...) in a `with` block so the archive is explicitly
closed, instead of leaving it to garbage collection. Pure refactor: both
scripts still reproduce byte-identical CSVs. Addresses Copilot review (raised
on PR #925, where this code previously lived).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

pandas 3.0 (anaconda 2026.06) breaks hansen_singleton lectures via pandas-datareader

2 participants